|
In statistics, a confounding variable (also confounding factor, confound, or confounder) is an extraneous variable in a statistical model that correlates (directly or inversely) with both the dependent variable and the independent variable. A spurious relationship is a perceived relationship between an independent variable and a dependent variable that has been estimated incorrectly because the estimate fails to account for a confounding factor. Such an estimate suffers from omitted-variable bias.

While specific definitions may vary, in essence a confounding variable fits the following four criteria, given here for a hypothetical situation with variable of interest "V", confounding variable "C" and outcome of interest "O":
# C is associated (inversely or directly) with O
# C is associated with O, independent of V
# C is associated (inversely or directly) with V
# C is not in the causal pathway from V to O (C is neither a direct consequence of V nor a way by which V produces O)

The preceding correlation-based definition, however, is metaphorical at best; a growing number of analysts agree that confounding is a causal concept and, as such, cannot be described in terms of correlations or associations 〔Pearl, J. (2009). Simpson's Paradox, Confounding, and Collapsibility. In ''Causality: Models, Reasoning and Inference'' (2nd ed.). New York: Cambridge University Press.〕〔VanderWeele, T.J. & Shpitser, I. (2013). On the definition of a confounder. ''Annals of Statistics'', 41:196-220.〕〔Greenland, S., Robins, J. M., & Pearl, J. (1999). Confounding and Collapsibility in Causal Inference. ''Statistical Science'', 14(1), 29–46.〕 (see causal definition).

== Causal definition ==
The concept of confounding must be defined, and managed, in terms of the data generating model (as in the Figure above). Specifically, let X be some independent variable and Y some dependent variable.
To estimate the effect of X on Y, the statistician must suppress the effects of extraneous variables that influence both X and Y. We say that X and Y are confounded by some other variable Z whenever Z is a cause of both X and Y.

In the causal framework, denote by P(y|do(x)) the probability of event Y = y under the hypothetical intervention X = x. X and Y are not confounded if and only if the following holds:

P(y|do(x)) = P(y|x)

for all values X = x and Y = y, where P(y|x) is the conditional probability upon seeing X = x. Intuitively, this equality states that X and Y are not confounded whenever the observationally witnessed association between them is the same as the association that would be measured in a controlled experiment, with x randomized.

In principle, the defining equality P(y|do(x)) = P(y|x) can be verified from the data generating model, assuming we have all the equations and probabilities associated with the model. This is done by simulating an intervention do(X = x) (see Bayesian networks) and checking whether the resulting probability of Y equals the conditional probability P(y|x). It turns out, however, that graph structure alone is sufficient for verifying the equality P(y|do(x)) = P(y|x), which is guaranteed to hold whenever X and Y do not share a common ancestor.

Excerpt source: Wikipedia, the free encyclopedia, article "Confounding".
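The check of the defining equality P(y|do(x)) = P(y|x) against a fully specified data generating model can be illustrated with a small sketch. It assumes a model that is not in the article: a single binary common cause Z, binary X and Y, and invented probabilities. The names `p_y_seeing_x`, `p_y_doing_x`, `P_Z`, `P_X_GIVEN_Z` and `P_Y_GIVEN_XZ` are hypothetical. Seeing X = x reweights Z by P(z|x), whereas the intervention do(X = x) cuts the arrow from Z to X, so Z keeps its marginal distribution P(z).

```python
# Hypothetical data generating model (all numbers invented for illustration):
#   Z -> X, Z -> Y, X -> Y, with every variable binary.
P_Z = 0.5                                   # P(Z=1)
P_X_GIVEN_Z = {0: 0.2, 1: 0.8}              # P(X=1 | Z=z)
P_Y_GIVEN_XZ = {(0, 0): 0.1, (0, 1): 0.5,   # P(Y=1 | X=x, Z=z), keyed by (x, z)
                (1, 0): 0.3, (1, 1): 0.7}


def p_y_seeing_x(p_z, p_x_given_z, p_y_given_xz, x):
    """P(Y=1 | X=x): condition on observing X=x, which reweights Z by P(z | x)."""
    pz = {0: 1 - p_z, 1: p_z}
    # P(X=x | Z=z) for each value of z
    px = {z: p_x_given_z[z] if x == 1 else 1 - p_x_given_z[z] for z in (0, 1)}
    p_x = sum(px[z] * pz[z] for z in (0, 1))            # P(X=x)
    joint = sum(p_y_given_xz[(x, z)] * px[z] * pz[z] for z in (0, 1))
    return joint / p_x


def p_y_doing_x(p_z, p_y_given_xz, x):
    """P(Y=1 | do(X=x)): setting X by intervention leaves Z at its marginal P(z)."""
    pz = {0: 1 - p_z, 1: p_z}
    return sum(p_y_given_xz[(x, z)] * pz[z] for z in (0, 1))


if __name__ == "__main__":
    seen = p_y_seeing_x(P_Z, P_X_GIVEN_Z, P_Y_GIVEN_XZ, x=1)
    done = p_y_doing_x(P_Z, P_Y_GIVEN_XZ, x=1)
    print(f"P(Y=1 | X=1)     = {seen:.2f}")
    print(f"P(Y=1 | do(X=1)) = {done:.2f}")

    # Remove the arrow Z -> X: X no longer depends on Z, so Z is not a common cause.
    no_arrow = {0: 0.5, 1: 0.5}
    seen_unconfounded = p_y_seeing_x(P_Z, no_arrow, P_Y_GIVEN_XZ, x=1)
    print(f"P(Y=1 | X=1), no Z -> X arrow = {seen_unconfounded:.2f}")
```

With these assumed numbers the two quantities differ for x = 1 (about 0.62 when seeing, 0.50 when doing), so X and Y are confounded by Z. Once X no longer depends on Z, the observational value coincides with the interventional one, in line with the graphical condition that X and Y share no common ancestor.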
|